Static Versus Dynamic Sampling for Data Mining

Authors

  • George H. John
  • Pat Langley
Abstract

As data warehouses grow to the point where one hundred gigabytes is considered small, the computational efficiency of data-mining algorithms on large databases becomes increasingly important. Using a sample from the database can speed up the data-mining process, but this is only acceptable if it does not reduce the quality of the mined knowledge. To this end, we introduce the “Probably Close Enough” criterion to describe the desired properties of a sample. Sampling usually refers to the use of static statistical tests to decide whether a sample is sufficiently similar to the large database, in the absence of any knowledge of the tools the data miner intends to use. We discuss dynamic sampling methods, which take into account the mining tool being used and can thus give better samples. We describe dynamic schemes that observe a mining tool’s performance on training samples of increasing size and use these results to determine when a sample is sufficiently large. We evaluate these sampling methods on data from the UCI repository and conclude that dynamic sampling is preferable.
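
The dynamic scheme described in the abstract (train on samples of increasing size and stop once the sample is large enough) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the stopping rule (stop when the accuracy gain between consecutive sizes drops below a threshold), the names `dynamic_sample_size`, `evaluate`, `epsilon`, and the synthetic `toy_learning_curve` are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Optional


def dynamic_sample_size(
    evaluate: Callable[[int], float],
    sizes: Iterable[int],
    epsilon: float = 0.01,
) -> Optional[int]:
    """Return the first sample size at which the accuracy gain over the
    previous size falls below epsilon, or None if that never happens.

    `evaluate(n)` is a hypothetical callback that trains the mining tool
    on a sample of n records and returns its estimated accuracy.
    """
    previous = None
    for n in sizes:
        accuracy = evaluate(n)
        # Assumed stopping rule: the learning curve has flattened.
        if previous is not None and accuracy - previous < epsilon:
            return n
        previous = accuracy
    return None


def toy_learning_curve(n: int) -> float:
    """Synthetic stand-in for a real mining tool: accuracy rises toward 0.9."""
    return 0.9 - 50.0 / (n + 100)


# Doubling schedule of sample sizes (an illustrative choice).
schedule = [100, 200, 400, 800, 1600, 3200, 6400, 12800]
chosen = dynamic_sample_size(toy_learning_curve, schedule, epsilon=0.01)
print(chosen)
```

In practice `evaluate` would wrap the actual mining tool and a held-out accuracy estimate, which is what lets the stopping decision depend on the tool rather than on a tool-independent statistical test.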


Publication date: 1996